Bootstrapped Training of Event Extraction Classifiers

نویسندگان

  • Ruihong Huang
  • Ellen Riloff
چکیده

Most event extraction systems are trained with supervised learning and rely on a collection of annotated documents. Due to the domain-specificity of this task, event extraction systems must be retrained with new annotated data for each domain. In this paper, we propose a bootstrapping solution for event role filler extraction that requires minimal human supervision. We aim to rapidly train a state-of-the-art event extraction system using a small set of “seed nouns” for each event role, a collection of relevant (in-domain) and irrelevant (outof-domain) texts, and a semantic dictionary. The experimental results show that the bootstrapped system outperforms previous weakly supervised event extraction systems on the MUC-4 data set, and achieves performance levels comparable to supervised training with 700 manually annotated documents.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Distributed Representations of Words to Guide Bootstrapped Entity Classifiers

Bootstrapped classifiers iteratively generalize from a few seed examples or prototypes to other examples of target labels. However, sparseness of language and limited supervision make the task difficult. We address this problem by using distributed vector representations of words to aid the generalization. We use the word vectors to expand entity sets used for training classifiers in a bootstra...

متن کامل

External Evaluation of Event Extraction Classifiers for Automatic Pathway Curation: An extended study of the mTOR pathway

This paper evaluates the impact of various event extraction systems on automatic pathway curation using the popular mTOR pathway. We quantify the impact of training data sets as well as different machine learning classifiers and show that some improve the quality of automatically extracted pathways.

متن کامل

Unsupervised Extraction of Prosodic Structure

Our approach for unsupervised extraction of prosodic structure in spontaneous speech consists of the four steps: chunking into interpausal units, syllable nucleus extraction, prosodic boundary detection, and pitch accent detection. The extraction is based on acoustic features derived from F0 parameterization, and on energy and segment duration features. Phrase boundaries and accents are detecte...

متن کامل

Overview of UI-CCG Systems for Event Argument Extraction, Entity Discovery and Linking, and Slot Filler Validation

In this paper, we describe the University of Illinois (UI CCG) submission to the 2013 TAC KBP Event Argument Extraction (EAE), English Entity Discovery and Linking (EDL), and Slot Filler Validation (SFV) tasks. We developed three separate systems. Our Event Argument Recognition system infers world knowledge from event argument overlaps to improve performance of a recognition/labeling pipeline. ...

متن کامل

Bootstrapped Self Training for Knowledge Base Population

A central challenge in relation extraction is the lack of supervised training data. Pattern-based relation extractors suffer from low recall, whereas distant supervision yields noisy data which hurts precision. We propose bootstrapped selftraining to capture the benefits of both systems: the precision of patterns and the generalizability of trained models. We show that training on the output of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012